A dynamical view of different solution
paradigms in two-person symmetric games: 
Nash vs. co-action equilibria 

V. Sasidevan and Sitabhra Sinha 


Abstract The study of games and their equilibria is central to developing insights
for understanding many socio-economic phenomena. Here we present a dynamical
systems view of the equilibria of two-person, payoff-symmetric games. In particular,
using this perspective, we discuss the differences between two solution concepts for
such games, namely, those of Nash equilibrium and co-action equilibrium. For the
Nash equilibrium, we show that the dynamical view can provide an equilibrium
refinement, selecting one equilibrium among several possibilities, thereby solving
the issue of multiple equilibria that appear in some games. We illustrate this
dynamical perspective in detail by considering three well-known 2-person games, namely
the Prisoner’s Dilemma, the Game of Chicken and the Stag-Hunt. We find that in all of
these cases, co-action equilibria tend to correspond to ‘nicer’ strategies than those
corresponding to Nash equilibria.


1 Introduction 


Games represent strategic interactions between entities generally referred to as
agents. Here, the term “agents” could refer to a variety of entities, ranging from
human beings or animals to computer programs or robots. In games, each agent
receives a payoff depending upon the strategy choices made by all agents, including
herself. Thus, an agent who wants to optimize her payoff should consider not
only the payoff structure of the game, but also the decision-making processes of
other agents. The choice of strategy by each agent in such an interaction leads to
a collective outcome that may or may not be globally optimal. In this context, it is
imperative to understand how two agents facing a game situation, who have to make
a strategic decision, will go about doing it, since the strategic interaction between
agents is the basis of the collective behavior in a system comprising such agents.
Financial markets, for example, may be viewed as the collective outcome of strategic
interactions between a large number of people participating in them. Another example
is that of evolution, where one may view evolution by natural selection as a result
of the interaction between competing genes. Cooperation and conflict are at the heart
of such systems and form the subject matter of the study of games. In games, in
general, each agent should have a behavior model of other agents so that she has a
way to describe the decision-making process of other agents. In this regard, standard
game theory makes several assumptions about the agents’ behavior. It assumes that
agents are fully rational, would like to optimize their payoff, and are perfect
in the execution of their strategies (see, e.g., [1] for a detailed discussion). While the
applicability of these assumptions in any particular situation is open to criticism,
they form an important benchmark for optimal behavior. In fact, these assumptions
form an important part of modern economic theory, in which the participating agents
are often assumed to be fully rational.

The simplest of games consists of the strategic interaction between two agents
in a single play of the game. In fact, 2-person games like the Prisoner’s Dilemma,
Stag-Hunt, etc., describe very general socio-economic scenarios, towards the analysis of
which considerable effort has been devoted. A key concept in the study of games is
that of an “equilibrium”. It refers to a state of affairs where each agent has decided
her strategy for the game at hand. How the agents pick their equilibrium strategy is
given by a solution concept. A solution concept is thus a formal rule for predicting
how a game will be played between agents, and it employs certain assumptions
regarding the agents’ behavior. An important solution concept for non-cooperative games is
that of Nash equilibrium. Informally, it is a state where, after every agent has selected
her ‘Nash’ strategy, none of the agents can improve her payoff by unilaterally
deviating from it. It is to be noted that a game may have more than one Nash
equilibrium.

In this article, we show that the equilibria of a game may be viewed as the “fixed- 
point” equilibria of a dynamical system. In particular, we present a dynamical view 
of the equilibria obtained by two different solution concepts, viz., Nash HI and 
co-action an, the latter being a concept that makes use of the symmetry between 
the agents for payoff-symmetric games. The vector flow diagrams on the strategy
space that are generated using the dynamics approach make the differences between
the equilibria obtained in the two solution concepts visually apparent. For the Nash
equilibrium, we argue that a dynamical perspective may be regarded as an equilibrium
refinement selecting one equilibrium out of several possible ones, thus solving
the multiplicity issue. We illustrate these points by considering three well-known
examples of 2-person games, namely the Prisoner’s Dilemma, the Game of Chicken and
Stag-Hunt.

Fig. 1 A generic representation of the payoff matrix for a 2-person symmetric game
where each agent has two actions available to her.


2 A dynamical framework for analyzing 2-person games

Here we describe a dynamical perspective for analyzing games, focusing on
2-person single-stage games in which two agents interact only once. No communication
is allowed between the agents. Furthermore, we consider the simple case
where each agent has to choose one of two possible actions (say, Action 1 and
Action 2) available to her. Each agent receives a payoff according to the pair of choices
made by them, such that the game may be represented by a payoff matrix that specifies
all possible outcomes (Fig. 1). In the terminology used here, each agent receives R
if both choose Action 1 and P if both choose Action 2, while if they choose differently,
the agent choosing Action 1 receives S and the one choosing Action 2 receives T.
We consider situations where the game is payoff
symmetric, i.e., on exchanging the identities of the players (A, B), the payoff matrix
remains unchanged. Note that most 2-person games that are studied in the literature
fulfil the above criteria. Given the payoff matrix, an agent can have a mixed strategy,
where she chooses Action 1 with some probability $p$ and Action 2 with probability
$(1 - p)$. If $p$ is either 0 or 1, it is called a pure strategy. Given a game, represented by
a matrix containing the numerical values of R, S, T and P (or a hierarchical relation
among them), a Nash equilibrium is defined as a state, i.e., a set of the choices made
by all the agents, where no agent can increase her payoff by unilaterally deviating
from the Nash state. A Nash equilibrium comprising pure strategies may be found
by a search procedure, whereby each possible state is explicitly examined for the
above criterion. Note that a given game can have more than one Nash equilibrium,
possibly involving mixed strategies. In such cases, the choice of a particular
equilibrium will have to involve additional refinement criteria, which is an important area
of research in game theory [1].
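
The search procedure for pure-strategy Nash equilibria mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `pure_nash_equilibria` and the encoding of the payoff matrix as a nested list are assumptions made here, not part of the original text. The payoff assignments follow Eqs. (1) and (2).

```python
from itertools import product


def pure_nash_equilibria(R, S, T, P):
    """Return all pure-strategy Nash equilibria of a symmetric 2x2 game.

    Hypothetical helper for illustration. Actions are indexed 0 (Action 1)
    and 1 (Action 2); payoff[a][b] is the payoff to an agent playing a
    against an opponent playing b.
    """
    payoff = [[R, S], [T, P]]
    equilibria = []
    for a, b in product((0, 1), repeat=2):
        # Agent A must not gain by switching while B's choice stays fixed.
        a_cannot_improve = payoff[a][b] >= payoff[1 - a][b]
        # By payoff symmetry, B's payoff is payoff[b][a].
        b_cannot_improve = payoff[b][a] >= payoff[1 - b][a]
        if a_cannot_improve and b_cannot_improve:
            equilibria.append((a + 1, b + 1))  # report as Action 1 / Action 2
    return equilibria


if __name__ == "__main__":
    # Payoff values used in the figures of this article (R = 1, P = 0).
    print("PD:       ", pure_nash_equilibria(R=1, S=-0.5, T=1.5, P=0))
    print("Chicken:  ", pure_nash_equilibria(R=1, S=0.5, T=3.5, P=0))
    print("Stag-Hunt:", pure_nash_equilibria(R=1, S=-0.5, T=0.5, P=0))
```

Each state is tested with a non-strict inequality, so weak equilibria would also be reported; this choice is part of the sketch rather than of the definition given in the text.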

We now illustrate a dynamical perspective on Nash equilibria by first defining
payoff functions for all possible mixed strategies of the two agents. Assuming that
agent A (B) chooses Action 1 with probability $p_1$ ($p_2$) and Action 2 with probability
$1 - p_1$ ($1 - p_2$, respectively), the expected payoffs of the agents are

\[
W_A = p_1 p_2 R + p_1 (1 - p_2) S + (1 - p_1) p_2 T + (1 - p_1)(1 - p_2) P, \tag{1}
\]

\[
W_B = p_1 p_2 R + p_1 (1 - p_2) T + (1 - p_1) p_2 S + (1 - p_1)(1 - p_2) P. \tag{2}
\]

Fig. 2 The payoff functions $W_A$ and $W_B$ for two agents playing the Game of Chicken, shown as
functions of $p_1$ and $p_2$, i.e., the probabilities with which each agent chooses Action 1. The
payoffs (in terms of the terminology given in Fig. 1) are T = 3.5, R = 1, S = 0.5 and P = 0.

As the payoffs are continuous functions of $p_1$ and $p_2$, they can be represented as
two-dimensional surfaces (Fig. 2) analogous to fitness landscapes in biology or energy
landscapes in physics. However, unlike the latter, there are two distinct surfaces
for the two agents, and each of them would like to achieve the maximum of her
respective payoff function, a goal that may not be mutually compatible. By contrast,
the evolution of the state of a physical system can be seen as a convergence
process to a minimum of a single function, e.g., the free energy that describes the
entire system.
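
As a minimal sketch, the two payoff surfaces of Eqs. (1) and (2) can be evaluated numerically as follows. The function name `expected_payoffs` is an assumption introduced here for illustration; the payoff values are those quoted for the Game of Chicken in Fig. 2.

```python
def expected_payoffs(p1, p2, R, S, T, P):
    """Expected payoffs (W_A, W_B) of Eqs. (1) and (2).

    p1 and p2 are the probabilities with which agents A and B choose Action 1.
    """
    w_a = p1 * p2 * R + p1 * (1 - p2) * S + (1 - p1) * p2 * T + (1 - p1) * (1 - p2) * P
    w_b = p1 * p2 * R + p1 * (1 - p2) * T + (1 - p1) * p2 * S + (1 - p1) * (1 - p2) * P
    return w_a, w_b


if __name__ == "__main__":
    chicken = dict(R=1.0, S=0.5, T=3.5, P=0.0)
    # Tabulate both surfaces on a coarse grid of mixed strategies.
    for p1 in (0.0, 0.5, 1.0):
        for p2 in (0.0, 0.5, 1.0):
            w_a, w_b = expected_payoffs(p1, p2, **chicken)
            print(f"p1={p1:.1f} p2={p2:.1f}  W_A={w_a:.3f}  W_B={w_b:.3f}")
```

At the four corners of the strategy space the function returns the entries of the payoff matrix, which is a quick way to check the assignment of R, S, T and P.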

Given the payoff function surfaces, we can now proceed to find the strategy pairs
$(p_1^*, p_2^*)$ that correspond to a Nash equilibrium. Note that while the Nash solution
is usually not defined in terms of a dynamical perspective, one can view $(p_1^*, p_2^*)$ as
an equilibrium point for flow dynamics in the $p_1$-$p_2$ plane, as described below.
The initial condition for this dynamical system can be any arbitrary point in this
plane. Each agent is then allowed to change her strategy infinitesimally (i.e.,
$p_1 \to p_1 + dp_1$, $p_2 \to p_2 + dp_2$) in order to improve her payoff, taking into
consideration that the other agent would also be doing the same. A sequence of such
incremental changes, which will be manifested as a flow in the $p_1$-$p_2$ plane, would
eventually converge to an equilibrium point $(p_1^*, p_2^*)$. Note that, while such a strategy
pair would correspond to a stable equilibrium of the flow dynamics, there may also be
unstable equilibria.

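The dynamical equations governing the flow can be derived by considering the
change in the payoffs ($dW_A$, $dW_B$) of the two agents as a result of the infinitesimal
changes in their strategies $dp_1$, $dp_2$ (taking P = 0, which fixes the origin of the payoffs):

\[
\frac{\partial W_A}{\partial p_1} = p_2 (R - T) + (1 - p_2) S,
\qquad
\frac{\partial W_B}{\partial p_2} = p_1 (R - T) + (1 - p_1) S. \tag{3}
\]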
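
The step from Eqs. (1) and (2) to Eq. (3) can be sketched as follows. Differentiating each payoff with respect to the agent's own probability gives a general expression in which P still appears. The form of Eq. (3) appears to correspond to the choice P = 0 (the normalization adopted in Section 3); this is an inference from the equations themselves.

```latex
\begin{align*}
\frac{\partial W_A}{\partial p_1}
  &= p_2 R + (1 - p_2) S - p_2 T - (1 - p_2) P \\
  &= p_2 (R - T) + (1 - p_2)(S - P), \\
\frac{\partial W_B}{\partial p_2}
  &= p_1 R + (1 - p_1) S - p_1 T - (1 - p_1) P \\
  &= p_1 (R - T) + (1 - p_1)(S - P).
\end{align*}
```

Setting P = 0 in the last line of each expression recovers the two components of Eq. (3).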

Thus, at any point in the $p_1$-$p_2$ plane, the magnitude and direction of the flow can
be obtained by a vector sum of the two components given by Eq. (3). The resulting
flow diagram will describe the trajectory in strategy space starting from any arbitrary
strategy pair $(p_1, p_2)$. This will be illustrated with specific examples of 2-person
games in the next section.
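
A trajectory of this flow can be traced numerically. The sketch below assumes that the velocity at each point is simply equal to the two components of Eq. (3) (with P = 0), uses a plain Euler scheme, and clips the probabilities to the interval [0, 1]. The step size, the number of steps and the function names are illustrative choices, not taken from the original text.

```python
def nash_flow(p1, p2, R=1.0, S=0.5, T=3.5):
    """Flow components of Eq. (3), with P = 0 (defaults: Chicken, T = 3.5)."""
    dp1 = p2 * (R - T) + (1 - p2) * S
    dp2 = p1 * (R - T) + (1 - p1) * S
    return dp1, dp2


def clip(x):
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, x))


def integrate(p1, p2, step=0.01, n_steps=5000, **payoffs):
    """Follow the flow from (p1, p2) with Euler steps (an assumed scheme)."""
    for _ in range(n_steps):
        dp1, dp2 = nash_flow(p1, p2, **payoffs)
        p1 = clip(p1 + step * dp1)
        p2 = clip(p2 + step * dp2)
    return p1, p2


if __name__ == "__main__":
    print(integrate(0.6, 0.4))                    # Chicken, start below the diagonal
    print(integrate(0.4, 0.6))                    # Chicken, start above the diagonal
    print(integrate(0.9, 0.9, S=-0.5, T=1.5))     # Prisoner's Dilemma
```

Starting points on either side of the diagonal end at different corners in the Game of Chicken, whereas every starting point in the Prisoner's Dilemma ends at mutual defection.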

As mentioned earlier, Nash equilibrium is not the only possible solution of a
payoff-symmetric game. Recently, an alternative paradigm referred to as co-action
equilibrium has been introduced for solving such games in the specific context of the
minority game [5]. Here we study this novel solution concept in the context of generic
2-person games with symmetric payoff from a dynamical perspective. The key notion
of co-action equilibrium is the following: as the two agents are aware that they face an
exactly symmetric situation, the choice made by agent A should be identical to the choice
of agent B, assuming that they are equally rational (for a detailed discussion see
Ref. [5]). Thus, in terms of the flow dynamics introduced above, in this solution
concept, each agent will take into account in her calculation for revising her strategy
that the other agent is not only using the same strategy (i.e., $p_1 = p_2$) but will
also make exactly the same infinitesimal change, i.e., $dp_1 = dp_2$. Then the change
in the payoffs of the two agents, as a result of changing $p_1$, $p_2$ (analogous to Eq. (3)
for Nash equilibrium, with P = 0), is:

\[
\frac{dW_A}{dp_1} = 2 p_1 R + (1 - 2 p_1)(T + S),
\qquad
\frac{dW_B}{dp_2} = 2 p_2 R + (1 - 2 p_2)(T + S). \tag{4}
\]

Note that the above equations hold not only when $p_1 = p_2 = p$ (so that the dynamics
is confined to the diagonal line in the $p_1$-$p_2$ plane), but also for situations where
the two agents initially start with different strategies ($p_1 \neq p_2$), each believing, however,
that the other agent is using exactly the same strategy.
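
The origin of Eq. (4) can be sketched by imposing the co-action constraint directly on Eq. (1). With both agents using the same probability $p$, the payoff of either agent becomes a function of $p$ alone, and its derivative is the rate of change used in the flow. The intermediate function $W(p)$ is notation introduced here for illustration.

```latex
\begin{align*}
W(p) &= p^2 R + p(1 - p)(S + T) + (1 - p)^2 P, \\
\frac{dW}{dp} &= 2pR + (1 - 2p)(S + T) - 2(1 - p)P .
\end{align*}
```

For P = 0 the last term vanishes and the expression reduces to the form of Eq. (4), with $p$ replaced by $p_1$ for agent A and by $p_2$ for agent B.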

The co-action solution yields results that differ remarkably from those obtained
using the concept of Nash equilibrium, some of which will be described in the next
section in the context of specific 2-person games. An important distinction is that
while there could be multiple Nash equilibria for a game, the corresponding
co-action equilibrium is unique. The dynamical perspective also allows us to
distinguish between Nash and co-action solutions for 2-person symmetric games in that
a stable mixed-strategy equilibrium is possible for the latter, unlike in the former
(Nash), where a mixed-strategy equilibrium, if it exists, is always unstable.
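
A possible way to compute the co-action equilibrium is sketched below. It rests on an assumption made here for illustration: since Eq. (4) is the derivative of the common payoff $W(p) = p^2 R + p(1-p)(S+T) + (1-p)^2 P$ obtained when both agents use the same probability $p$, the equilibrium is taken to be the value of $p$ in [0, 1] that maximizes $W(p)$. The function name `coaction_equilibrium` is hypothetical.

```python
def coaction_equilibrium(R, S, T, P=0.0):
    """Return the p in [0, 1] that maximizes the common payoff W(p).

    Assumed reading of the co-action equilibrium, used for illustration only.
    """
    def common_payoff(p):
        return p * p * R + p * (1 - p) * (S + T) + (1 - p) * (1 - p) * P

    candidates = [0.0, 1.0]
    curvature = R - (S + T) + P  # coefficient of p^2 in W(p)
    if curvature < 0:
        # Concave parabola: the stationary point is a maximum; keep it
        # only if it lies strictly inside the unit interval.
        p_star = ((S + T) - 2 * P) / (2 * ((S + T) - R - P))
        if 0.0 < p_star < 1.0:
            candidates.append(p_star)
    return max(candidates, key=common_payoff)


if __name__ == "__main__":
    print("PD, T=1.5:     ", coaction_equilibrium(R=1, S=-0.5, T=1.5))
    print("PD, T=3.5:     ", coaction_equilibrium(R=1, S=-0.5, T=3.5))
    print("Chicken, T=3.5:", coaction_equilibrium(R=1, S=0.5, T=3.5))
    print("Stag-Hunt:     ", coaction_equilibrium(R=1, S=-0.5, T=0.5))
```

For the parameter values used in the figures, this reproduces the equilibrium probabilities quoted in the captions of Figs. 4 and 6.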

Note that while the flow diagrams produced by the dynamical process presented 
here may resemble the trajectories generated by solving replicator equations 0, the 
two approaches are essentially distinct. In particular, the latter approach is based on
the concept of evolutionarily stable strategies, which is an equilibrium refinement of
the Nash solution. Also, instead of being stages in the evolutionary progression of
a population, the sequence of infinitesimal changes in strategies in the dynamical
approach presented here can be interpreted as steps in the deductive reasoning of
the two agents, at the end of which they choose the strategy corresponding to the
equilibrium they converge to. When our approach is applied to study the Nash
solution of a game, it can also be viewed as an equilibrium refinement since, if there are
multiple Nash equilibria, it allows agents to choose a particular equilibrium depending
on the arbitrarily chosen initial state. Thus, in an ensemble of many realizations
of a game, the fraction of cases where agents will converge to a particular equilibrium
is proportional to the size of its basin of attraction. An unstable equilibrium
(if it exists) will lie on the separatrix that demarcates the basins of different stable
equilibria.
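
The relation between basin size and the fraction of realizations ending at each equilibrium can be estimated numerically. The sketch below is illustrative only: it assumes Euler integration of the flow of Eq. (3) with P = 0, replaces the arbitrarily chosen initial states by a regular grid of starting points (slightly offset so that no point starts exactly on a symmetry line of the flow), and uses hypothetical function names.

```python
def nash_flow(p1, p2, R, S, T):
    """Flow components of Eq. (3), with P = 0."""
    return p2 * (R - T) + (1 - p2) * S, p1 * (R - T) + (1 - p1) * S


def converge(p1, p2, R, S, T, step=0.01, n_steps=3000):
    """Follow the flow with Euler steps and return the corner reached."""
    for _ in range(n_steps):
        d1, d2 = nash_flow(p1, p2, R, S, T)
        p1 = min(1.0, max(0.0, p1 + step * d1))
        p2 = min(1.0, max(0.0, p2 + step * d2))
    return round(p1), round(p2)


def basin_fractions(R, S, T, n=10):
    """Fraction of grid starting points that end at each pure-strategy corner."""
    counts = {}
    for i in range(n):
        for j in range(n):
            # Unequal offsets keep starting points off the symmetry lines.
            start = ((i + 0.5) / n, (j + 0.25) / n)
            end = converge(*start, R, S, T)
            counts[end] = counts.get(end, 0) + 1
    total = n * n
    return {corner: count / total for corner, count in counts.items()}


if __name__ == "__main__":
    # Stag-Hunt with the payoff values used in Fig. 7.
    print(basin_fractions(R=1.0, S=-0.5, T=0.5))
```

A finer grid or random sampling of the starting points would give a better estimate of the basin areas; the coarse grid is used here only to keep the example short.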


3 Examples 

We now illustrate the approach outlined in the previous section using three
well-studied 2-person symmetric games, viz., Prisoner’s Dilemma, Game of Chicken and
Stag-Hunt, each of which can be connected to real-life scenarios involving interactions
between a pair of agents who have to choose between two possible actions.
Each of these games is defined in terms of a specific hierarchical relationship
between the payoffs R, S, T and P (using the terminology of the payoff matrix in
Fig. 1). Without loss of generality, we can set P = 0 and R = 1 (thereby fixing the
origin and the scale for the payoffs), leaving only S and T as free parameters. In
the following subsections, we discuss each of these games in turn, describing the
meaning of the different choices available to the agents (viz., Action 1 and Action 2)
in a particular game, and exploring the different equilibria obtained by using the Nash
and co-action solution concepts.


3.1 Prisoner’s Dilemma 

Prisoner’s Dilemma (PD) Q can be regarded as one of the most well-known games 
in the literature. It has evoked great interest among researchers from a multitude of
disciplines ranging from social sciences and politics to biology and physics, from
the 1950s onwards, and continues to do so at present (a good place to read about
historical developments in PD is the corresponding entry in the online Stanford
Encyclopedia of Philosophy 13). The game represents the strategic interaction between
two players who have to choose between cooperation (Action 1) and defection
(Action 2). The different payoffs are interpreted as follows: R is a “reward” for both
players cooperating, P is a “punishment” for both players defecting, while, in the
event that one agent defects while the other cooperates, T is the “temptation”
received by the former and S is the “sucker’s payoff” of the latter. In PD, the hierarchical
relation between the payoffs is T > R > P > S, which makes achieving mutual
cooperation non-trivial as each player will benefit more by defecting (assuming that
the other will cooperate).

Fig. 3 Vector flow diagram representation of the Nash solution of the 2-person Prisoner’s Dilemma
game for temptation payoffs (left) T = 1.5 and (right) T = 3.5. The abscissa and ordinate
correspond to the probabilities ($p_1$ and $p_2$) that players 1 and 2, respectively, choose to cooperate.
The broken lines represent the best response (or reaction) correspondence of the players (red for
player 1, blue for player 2). The intersection of the lines, represented by a filled circle, represents
the single Nash equilibrium corresponding to both players defecting (i.e., $p_1 = 0$, $p_2 = 0$).

It is easy to see that mutual defection is the only Nash equilibrium for PD. As
Action 1 represents cooperation, $p_1$ ($p_2$) corresponds to the probability that agent A
(B) will choose cooperation. As discussed in the previous section, we can associate
a vector with each point in the ($p_1$, $p_2$) plane for the game, which describes the flow
from that point. Fig. 3 shows the resulting flow diagrams obtained using the Nash
solution concept for two different values of the temptation payoff T (keeping S
fixed at -0.5). In both cases, the system converges to the pure strategy $p_1 = p_2 = 0$
(mutual defection), which is the Nash equilibrium for PD.
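
The convergence to mutual defection from every starting point can be checked directly from Eq. (3). The sketch below assumes the normalization P = 0 and R = 1 of this section, so that the PD ordering T > R > P > S implies R - T < 0 and S < 0.

```latex
\[
\frac{\partial W_A}{\partial p_1} = p_2 (R - T) + (1 - p_2) S
\;\le\; \max(R - T,\; S) \;<\; 0
\quad \text{for all } p_2 \in [0, 1].
\]
```

The flow component is a weighted average of two negative numbers, so $p_1$ always decreases, and by symmetry the same holds for $p_2$. For S = -0.5, the component equals $-0.5$ when T = 1.5 and $-0.5 - 2 p_2$ when T = 3.5.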

By contrast, using the co-action solution concept, for low values of T we observe
mutual cooperation (i.e., $p_1 = p_2 = 1$) as the stable equilibrium of the system (Fig. 4,
left). For larger values of T, the stable equilibrium corresponds to a mixed strategy,
$0 < p_1 = p_2 < 1$ (Fig. 4, right). Thus, as discussed in detail in Ref. [5], using the
co-action concept for solving PD we can show that selfish agents trying to maximize
their individual payoffs can also achieve the state of maximum collective benefit.
This resolves a contentious aspect associated with the Nash solution of PD, where
the agents end up worse off in trying to optimize their individual payoffs [1].
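
As a worked example, the mixed co-action equilibrium follows from setting the right-hand side of Eq. (4) to zero. The closed-form expression below is derived here for illustration, assuming P = 0, and is evaluated for R = 1 and S = -0.5, the values used in Fig. 4.

```latex
\begin{align*}
2 p^* R + (1 - 2 p^*)(T + S) = 0
\;&\Longrightarrow\;
p^* = \frac{T + S}{2\,(T + S - R)}, \\
T = 3.5:\quad p^* &= \frac{3}{2 \times 2} = 0.75, \\
T = 1.5:\quad 2pR + (1 - 2p)(T + S) &= 2p + (1 - 2p) = 1 > 0 .
\end{align*}
```

For T = 1.5 the rate of change is positive for every $p$, so the flow ends at the boundary $p = 1$ (mutual cooperation), whereas for T = 3.5 it stops at the interior point $p^* = 0.75$.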


Fig. 4 Vector flow diagram representation of the co-action solution of the 2-person Prisoner’s
Dilemma game for temptation payoffs (left) T = 1.5 and (right) T = 3.5. The abscissa and
ordinate correspond to the probabilities ($p_1$ and $p_2$) that players 1 and 2, respectively, choose to
cooperate. The broken line represents the situation where the two agents have the same probability
of cooperation. The filled circles represent the unique co-action equilibrium for each value of T,
corresponding to the players cooperating with equal probability (= 1 for T = 1.5 and = 0.75 for
T = 3.5).


3.2 Chicken

The Game of Chicken (also referred to as Snowdrift) 12 is another well-studied
2-person game which is relevant in the context of social interactions [10] as well as
evolutionary biology [11] (where it is also known as Hawk-Dove). The game represents
a strategic interaction between two players who are driving towards each other
on a potential collision course, and have the choice between “chickening out”, i.e.,
swerving away from the path of the other (Action 1), or continuing on the path
(Action 2). Thus, the choices correspond to being docile or aggressive, respectively. If
both players decide to swerve away, they receive the payoff R, while if one swerves
and the other continues on the path, the former loses face (getting the payoff S)
and the latter wins (payoff T). However, the worst possible outcome corresponds to
when both players continue on the path, eventually resulting in a collision, which is
associated with payoff P. The hierarchical relation between the payoffs in Chicken
is T > R > S > P, which suggests that a player will benefit from being aggressive
as long as the other is docile, but is better off being docile if she is sure that the other
will play aggressively, as the cost of mutually aggressive behavior is high.

Analyzing this game using the dynamical perspective described earlier yields the
flow diagram shown in Fig. 5 (obtained for two different values of T, with S = 0.5)
on using the Nash solution concept. As can be seen, two of the multiple Nash
equilibria are stable, corresponding to the pure strategies (i) $p_1 = 1$, $p_2 = 0$ and (ii)
$p_1 = 0$, $p_2 = 1$ (i.e., when one player is aggressive, the other is docile). The remaining
equilibrium is an unstable mixed strategy located on the $p_1 = p_2$ line (which
defines the separatrix demarcating the basins of attraction of the two stable equilibria).
With increasing T, the unstable equilibrium, which dynamically corresponds
to a saddle point in the $p_1$-$p_2$ plane, moves closer to $p_1 = 0$, $p_2 = 0$, corresponding
to mutual aggression.
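
The location of the mixed equilibrium can be obtained by setting both components of Eq. (3) to zero, which gives $p_1 = p_2 = S / (S + T - R)$ for P = 0. The short sketch below evaluates this expression; the closed form and the function name `mixed_nash_probability` are derived and introduced here for illustration.

```python
def mixed_nash_probability(R, S, T):
    """Probability of Action 1 at the symmetric fixed point of Eq. (3), P = 0.

    Returns None when no fixed point lies strictly inside (0, 1).
    """
    denominator = S + T - R
    if denominator == 0:
        return None
    p = S / denominator
    return p if 0.0 < p < 1.0 else None


if __name__ == "__main__":
    # Chicken with S = 0.5: the mixed equilibrium shifts towards
    # mutual aggression (p = 0) as the temptation T grows.
    for T in (1.5, 2.5, 3.5, 10.0):
        print(f"T = {T:4.1f}  p* = {mixed_nash_probability(1.0, 0.5, T):.4f}")
```

For the Prisoner's Dilemma parameters used earlier, the function returns None, consistent with mutual defection being the only Nash equilibrium there.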

Using the co-action solution concept gives rise to a qualitatively different solution,
as seen in the flow diagrams in Fig. 6. When T is low, the system has a stable
equilibrium at $p_1 = 1$, $p_2 = 1$, i.e., both agents choose docile behavior to avoid the
potential damages associated with mutual aggression. For higher values of T, the
stable equilibrium is a mixed strategy, $0 < p_1 = p_2 < 1$. As in PD, the co-action
paradigm yields a single, stable solution of the game.
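
The convergence to a single co-action equilibrium can be illustrated by following the flow of Eq. (4) from different starting probabilities. The sketch below assumes P = 0, an Euler scheme with clipping to [0, 1], and hypothetical function names; since each agent's equation in Eq. (4) involves only her own probability, one function suffices for both agents.

```python
def coaction_flow(p, R=1.0, S=0.5, T=3.5):
    """Right-hand side of Eq. (4) with P = 0 (defaults: Chicken, T = 3.5)."""
    return 2 * p * R + (1 - 2 * p) * (T + S)


def relax(p, step=0.01, n_steps=5000, **payoffs):
    """Follow the co-action flow from p with Euler steps (an assumed scheme)."""
    for _ in range(n_steps):
        p = min(1.0, max(0.0, p + step * coaction_flow(p, **payoffs)))
    return p


if __name__ == "__main__":
    # Agents that start with different strategies end at the same point.
    for p0 in (0.0, 0.3, 1.0):
        print(f"T=3.5, start {p0:.1f} -> {relax(p0):.4f}")
    for p0 in (0.0, 0.3):
        print(f"T=1.5, start {p0:.1f} -> {relax(p0, T=1.5):.4f}")
```

With S = 0.5, all starting points lead to the same mixed strategy for T = 3.5 and to mutual docility for T = 1.5, in line with the values quoted in the caption of Fig. 6.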

Fig. 5 Vector flow diagram representation of the Nash solution of the 2-person Chicken game
for “temptation” payoffs (left) T = 1.5 and (right) T = 3.5. The abscissa and ordinate
correspond to the probabilities ($p_1$ and $p_2$) that players 1 and 2, respectively, choose to be docile
(i.e., non-aggressive). The broken lines represent the best response (or reaction) correspondence of
the players (red for player 1, blue for player 2). The intersections of the lines, represented by unfilled
and filled circles, represent the unstable and stable Nash equilibria, respectively. The stable
equilibria correspond to the pure-strategy combinations in which one player is aggressive and the
other is docile, while the unstable equilibrium in each case corresponds to a mixed strategy.

Fig. 6 Vector flow diagram representation of the co-action solution of the 2-person Chicken game
for “temptation” payoffs (left) T = 1.5 and (right) T = 3.5. The abscissa and ordinate
correspond to the probabilities ($p_1$ and $p_2$) that players 1 and 2, respectively, choose to be docile
(i.e., non-aggressive). The broken line represents the situation where the two agents have the same
probability of being docile. The filled circles represent the unique co-action equilibrium for each value
of T, corresponding to the players choosing to be docile with equal probability (= 1 for T = 1.5
and = 2/3 for T = 3.5).

Fig. 7 Vector flow diagram representation of the (left) Nash and (right) co-action solutions of the
2-person Stag-Hunt game for T = 0.5 and S = -0.5. The abscissa and ordinate correspond to the
probabilities ($p_1$ and $p_2$) that players 1 and 2, respectively, choose ‘Stag’ instead of ‘Hare’. (Left)
The broken lines represent the best response (or reaction) correspondence of the players (red for
player 1, blue for player 2). (Right) The broken line represents the situation where the two agents
have the same probability of choosing ‘Stag’.


3.3 Stag-Hunt 

The last of our examples, Stag-Hunt, is a 2-person game that has been studied in
the context of emergence of coordination in social interactions [12]. The game
represents a strategic interaction between two players who have to choose between
hunting stag (Action 1) or hunting hare (Action 2). A hare may be caught by a single
agent but is worth less than a stag. On the other hand, hunting a stag is successful
only if both agents hunt for it. Thus, if both agents cooperate by hunting stag, they
receive the highest payoff R. If they both choose to hunt hare,
they receive the payoff P. However, if one chooses to hunt hare while the other goes
for a stag, then the former receives the payoff T while the latter receives the worst
possible payoff S. Thus, in Stag-Hunt, the hierarchical relation between the payoffs
is R > T > P > S, which suggests that while choosing to hunt hare may be the safer
option, there is a possibility of doing much better by choosing to hunt stag if one is
confident that the other will also do the same.
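
The three payoff orderings stated in Sections 3.1 to 3.3 can be collected into a small classifier. This is a minimal sketch; the function name `classify_game` and the label "other" for payoff sets that match none of the three orderings are assumptions made here.

```python
def classify_game(R, S, T, P):
    """Identify the game from the hierarchical relation between its payoffs."""
    if T > R > P > S:
        return "Prisoner's Dilemma"
    if T > R > S > P:
        return "Chicken"
    if R > T > P > S:
        return "Stag-Hunt"
    return "other"


if __name__ == "__main__":
    # Payoff values used in the figures of this article (R = 1, P = 0).
    print(classify_game(R=1, S=-0.5, T=1.5, P=0))
    print(classify_game(R=1, S=0.5, T=3.5, P=0))
    print(classify_game(R=1, S=-0.5, T=0.5, P=0))
```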

The vector flow diagrams for the Nash and co-action solution concepts in the
Stag-Hunt are shown in Fig. 7 (obtained for T = 0.5 and S = -0.5). For Nash, as in the
Game of Chicken, there are three equilibria (Fig. 7, left), of which the pure strategies,
corresponding to (i) $p_1 = 1$, $p_2 = 1$ and (ii) $p_1 = 0$, $p_2 = 0$, are stable (i.e., when both
players hunt for stag or when both players hunt hare). The remaining equilibrium
is an unstable mixed strategy located on the $p_1 = p_2$ line; the separatrix
demarcating the basins of attraction of the two stable equilibria passes through this point.
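
The instability of the mixed equilibrium can be sketched by linearizing the flow of Eq. (3). The sketch assumes that the flow velocity is taken equal to the components of Eq. (3) with P = 0; the overdot notation and the Jacobian $J$ are introduced here for illustration, and the numbers correspond to R = 1, T = 0.5 and S = -0.5 as in Fig. 7.

```latex
\begin{align*}
\dot{p}_1 &= (R - T - S)\, p_2 + S, \qquad \dot{p}_2 = (R - T - S)\, p_1 + S, \\
p^* &= \frac{S}{S + T - R} = \frac{-0.5}{-0.5 + 0.5 - 1} = 0.5, \\
J &= \begin{pmatrix} 0 & R - T - S \\ R - T - S & 0 \end{pmatrix},
\qquad \lambda_{\pm} = \pm (R - T - S) = \pm 1 .
\end{align*}
```

The two eigenvalues have opposite signs whenever $R - T - S \neq 0$, so under these assumptions the mixed equilibrium at $p_1 = p_2 = p^*$ is a saddle point.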

The co-action solution for the game (Fig. 7, right) is a simple one in which
both agents always choose hunting stag, i.e., $p_1 = p_2 = 1$. Thus, under the co-action
concept, the players always converge to the best possible outcome. In this case, there
is no mixed-strategy equilibrium for any value of the parameters.


4 Conclusions 

In this article we have shown that using a dynamical perspective gives us a visually
appealing way to differentiate between two solution concepts, viz., Nash and
co-action, for 2-person symmetric games, which lead to spectacularly different
conclusions. To illustrate these differences, we looked at three examples of
such games in detail: Prisoner’s Dilemma, Chicken and Stag-Hunt. In all of these
games, one action (in particular, Action 1 in the terminology used here)
corresponds to the players being “nicer” to each other (e.g., cooperating in PD, etc.)
compared to the other action. The vector flow diagrams generated by the approach
presented here clearly show that co-action more often results in the agents converging
to nicer strategies than is the case for Nash. Our results are intriguing in
view of the experimental literature on 2-person games (see discussion in Ref. [1]),
in particular PD, which seems to suggest that when these games are played between
real human individuals they tend to be far nicer than suggested by the Nash solution.